European Heart Journal - Digital Health — Latest Matching Preprints

1

Beyond Doppler: Scalable AI Detection of LVOT Obstruction in HCM

Crystal, O.; Farina, J. M. M.; Scalia, I. G.; Ayoub, C.; Park, H.-B.; Kim, K. A.; Arsanjani, R.; Lester, S. J.; Banerjee, I.

2026-04-20 cardiovascular medicine 10.64898/2026.04.17.26351151 medRxiv

Top 0.1%

25.9%

Background: Accurate assessment of left ventricular outflow tract (LVOT) gradients is critical for hypertrophic cardiomyopathy (HCM) management, yet Doppler-based measurements are technically demanding and require expertise. Objective: To develop a multi-view deep learning model capable of classifying LVOT obstruction (> 20 mmHg) using routine 2D echocardiographic windows without reliance on Doppler imaging. Methods: We trained and externally validated a cross-attention-based video-to-video fusion framework that integrated EchoPrime-derived video representations from three standard transthoracic echocardiographic views to classify LVOT gradients. Results: Training was performed on a derivation cohort (N = 1833) from a tertiary care system in the United States, with model performance evaluated on an internal held-out test set (N = 275) and a Korean external validation cohort (N = 46). Single-view baselines showed limited discrimination (external AUROCs 0.47-0.70). Conversely, the domain-specific foundation model (EchoPrime) achieved superior single-view performance (AUROCs 0.75-0.80 internal; 0.79-0.83 external), highlighting the importance of echo-specific pretraining and temporal modeling. The proposed multi-view fusion further enhanced predictive performance, with the late fusion model reaching an AUROC of 0.84 on the external cohort with significant population shift. Conclusions: These results suggest LVOT physiology is encoded in routine 2D imaging and can be leveraged for clinically relevant gradient classification without Doppler input. The proposed AI-guided strategy demonstrates substantial cost savings compared with the screen-all approach. By integrating complementary spatial-temporal information across multiple views, our approach generalizes robustly across populations and may enable real-time decision support, extend LVOT assessment to portable or resource-limited settings, and complement Doppler-based evaluation for longitudinal HCM management.

2

PRAM: Post-hoc Retrieval Augmentation for Parameter-Free Domain Adaptation of ICU Clinical Prediction Models

Jeong, I.; Lee, T.; Kim, B.; Park, J.-H.; Kim, Y.; Lee, H.

2026-04-05 health systems and quality improvement 10.64898/2026.04.03.26350132 medRxiv

Top 0.1%

14.9%

Background Clinical prediction models degrade when deployed across hospitals, yet retraining requires technical expertise, labeled data, and regulatory re-approval. We investigated whether post-hoc retrieval augmentation of a frozen model's output, analogous to retrieval-augmented methods in natural language processing, can mitigate this degradation without any parameter modification. Methods We developed the Post-hoc Retrieval Augmentation Module (PRAM), which combines predictions from a frozen base model with outcome information retrieved from similar patients in a local patient bank. Five base models (logistic regression through CatBoost) and three retrieval strategies were evaluated on 116,010 ICU patients across three databases (MIMIC-IV, MIMIC-III, eICU-CRD) for acute kidney injury (AKI) and mortality prediction. A bank size deployment simulation modeled performance from zero to full local data accumulation, complemented by source bank cold start, stress tests, and calibration experiments. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Results Retrieval benefit was inversely associated with base model complexity (ρ = -0.90 for AKI, -1.00 for mortality): simpler models benefited more, consistent with retrieval capturing residual signal unexploited by the base model. PRAM showed a statistically significant monotone dose-response between bank size and prediction performance across all six outcome-target combinations (Kendall τ trend test, q = 0.031 for all). At the pre-specified primary comparison (bank = 5,000), the improvement was confirmed for the two largest-shift settings (eICU-CRD AKI: ΔAUROC = +0.012, q < 0.001; eICU-CRD mortality: ΔAUROC = +0.026, q < 0.001). Pre-loading a source bank bridged the cold-start gap, providing an immediate performance gain equivalent to approximately 2,000 to 5,000 local patients.
Conclusions PRAM provides a parameter-free adaptation mechanism that requires no model retraining, gradient computation, or regulatory re-evaluation at the deployment site. Effect sizes were modest and did not reach cross-model superiority, but the consistent dose-response pattern and the absence of retraining requirements establish retrieval-based adaptation as a viable approach for clinical model transportability. The retrieval mechanism additionally opens a pathway toward case-based interpretability, where predictions are accompanied by identifiable similar patients from the deploying institution.
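
The general idea of combining a frozen model's output with outcomes retrieved from similar local patients can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean k-nearest-neighbour retrieval, the linear blending weight `alpha`, and the function name `retrieval_augmented_prob` are all assumptions made for illustration, since the abstract does not specify the retrieval strategies or the combination rule.

```python
import numpy as np

def retrieval_augmented_prob(base_prob, query, bank_features, bank_outcomes,
                             k=5, alpha=0.5):
    """Blend a frozen model's probability with the outcome rate of similar patients.

    base_prob     : probability from the frozen base model (never retrained here)
    query         : feature vector of the new patient, shape (d,)
    bank_features : features of local bank patients, shape (n, d)
    bank_outcomes : binary outcomes of local bank patients, shape (n,)
    k, alpha      : hypothetical hyperparameters (neighbour count, blend weight)
    """
    if len(bank_outcomes) == 0:
        # Empty bank (cold start): fall back to the base model alone.
        return float(base_prob)
    # Euclidean distance from the query to every patient in the bank.
    dists = np.linalg.norm(bank_features - query, axis=1)
    k_eff = min(k, len(bank_outcomes))
    nearest = np.argsort(dists)[:k_eff]
    # Observed event rate among the retrieved neighbours.
    neighbor_rate = float(np.mean(bank_outcomes[nearest]))
    # Convex combination; the base model's parameters are untouched.
    return (1.0 - alpha) * float(base_prob) + alpha * neighbor_rate
```

With an empty bank the function returns the base prediction unchanged, which mirrors the "zero local data" end of the deployment simulation described in the abstract; as the bank grows, the retrieved outcome rate contributes more local information.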

3

Prognostic value of artificial intelligence-derived echocardiographic measurements in transthyretin cardiomyopathy

Walser, A.; Flammer, A. J.; Hundertmark, M. J.; Shiri, I.; Ciocca, N.; Ryffel, C.; de Marchi, S.; Schwotzer, R.; Ruschitzka, F.; Tanner, F. C.; Graeni, C.; Benz, D. C.

2026-04-02 cardiovascular medicine 10.64898/2026.04.01.26349281 medRxiv

Top 0.1%

14.4%

Background: Transthyretin cardiomyopathy (ATTR-CM) is a progressive, potentially fatal disease requiring accurate risk stratification. Echocardiography is the first-line imaging modality, with AI-based tools increasingly applied for automated analysis, yet their prognostic value remains unknown. Objectives: To examine the prognostic value of AI-derived echocardiographic measurements and their incremental value beyond biomarker staging in ATTR-CM. Methods: This retrospective study included patients from two ATTR-CM registries. Baseline echocardiograms were analyzed using the fully automated AI-based software Us2.ai. Prognostic performance was assessed by Kaplan-Meier analysis, Cox regression, and ROC curves. A two-parameter echocardiographic staging system combining left ventricular (LV) global longitudinal strain (GLS) and right ventricular (RV) fractional area change (FAC) stratified patients into low (both normal), intermediate (one abnormal), and high risk (both abnormal). Results: Among 347 patients (91% male, median age 78 years), 141 experienced all-cause death or heart failure hospitalization over a median follow-up of 2.4 years. In multivariable analysis, AI-derived LV-GLS (HR 1.13 [1.03-1.25], p=0.011) and RV FAC (HR 0.96 [0.93-0.99], p=0.014) were independent outcome predictors. Echo staging stratified risk into groups with 3-fold (95% CI 1.70-5.91) and 6-fold (95% CI 3.22-10.30) increased hazard compared to low risk (p<0.001), with incremental prognostic value beyond National Amyloidosis Centre (NAC) staging and age (chi-square from 53 to 80; p<0.001). AI and human measurements showed comparable 1-year predictive performance (all p>0.05). Conclusion: AI-derived echocardiographic measurements demonstrate independent and incremental prognostic value beyond biomarker-based NAC staging in ATTR-CM, comparable to human measurements, supporting their integration into clinical risk stratification.
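
The two-parameter staging rule described in the abstract (low if both parameters are normal, intermediate if one is abnormal, high if both are abnormal) can be written as a small function. This is a sketch only: the abstract does not give the cut-offs used to define an abnormal LV-GLS or RV FAC, so the function takes the two abnormality flags as inputs rather than raw measurements, and the name `echo_stage` is hypothetical.

```python
def echo_stage(lv_gls_abnormal: bool, rv_fac_abnormal: bool) -> str:
    """Map two abnormality flags to the three risk groups named in the abstract.

    The thresholds that decide whether LV-GLS or RV FAC is abnormal are not
    stated in the abstract, so they must be supplied by the caller.
    """
    n_abnormal = int(lv_gls_abnormal) + int(rv_fac_abnormal)
    # 0 abnormal -> low, 1 abnormal -> intermediate, 2 abnormal -> high
    return ("low", "intermediate", "high")[n_abnormal]
```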

4

Comparison of the Expert Guidelines With Artificial Intelligence-Driven Echocardiographic Assessment of Diastolic Function

Tokodi, M.; Kagiyama, N.; Pandey, A.; Nakamura, Y.; Akama, Y.; Takamatsu, S.; Toki, M.; Kitai, T.; Okada, T.; Lam, C. S.; Yanamala, N.; Sengupta, P.

2026-04-24 cardiovascular medicine 10.64898/2026.04.23.26350072 medRxiv

Top 0.1%

10.0%

Background: Accurate assessment of diastolic function and left ventricular (LV) filling pressure is central to heart failure diagnosis and risk stratification. Contemporary guideline algorithms rely on complex parameters that are not consistently available in routine clinical practice. Objective: To compare the diagnostic and prognostic performance of the 2016 American Society of Echocardiography/European Association of Cardiovascular Imaging (ASE/EACVI) and 2025 ASE guidelines with a deep learning model based on routinely acquired echocardiographic variables. Methods: This study evaluated the guideline-based algorithms and a deep learning model in participants from the Atherosclerosis Risk in Communities (ARIC) cohort (n=5450) for prognostication and two invasive hemodynamic validation cohorts from the United States (n=83) and Japan (n=130) for detection of elevated left ventricular filling pressure. Results: In the ARIC cohort, the deep learning model demonstrated superior prognostic performance compared with the 2016 and 2025 guidelines (C-index: 0.676 vs. 0.638 and 0.602, respectively; both p<0.001). Similar findings were observed among participants with preserved ejection fraction (C-index: 0.660 vs. 0.628 and 0.590; both p<0.001), with improved performance compared with the H2FPEF score (C-index: 0.660 vs. 0.607; p<0.001). In the US hemodynamic validation cohort, the deep learning model showed higher diagnostic performance than the 2025 guidelines (AUC: 0.879 vs. 0.822; p=0.041) and similar performance compared with the 2016 guidelines (AUC: 0.879 vs. 0.812; p=0.138). In the Japanese hemodynamic validation cohort, the deep learning model outperformed both guidelines (AUC: 0.816 vs. 0.634 and 0.694; both p<0.05).
Conclusions: A deep learning model leveraging routinely available echocardiographic parameters demonstrated improved diagnostic and prognostic performance compared with contemporary guideline-based approaches, potentially offering a scalable alternative for assessing diastolic function and left ventricular filling pressures.

5

DIVAID: Consistent division of atrial geometries from multimodal imaging according to the EHRA/EACVI 15-segment bi-atrial model

Goetz, C.; Eichenlaub, M.; Schmidt, K.; Wiedmann, F.; Invers Rubio, E.; Martinez Diaz, P.; Luik, A.; Althoff, T.; Schmidt, C.; Loewe, A.

2026-04-23 cardiovascular medicine 10.64898/2026.04.22.26351448 medRxiv

Top 0.1%

9.8%

The recently published EHRA/EACVI consensus statement on a standardized bi-atrial regionalization provides new opportunities for consistent regional analyses across patients, imaging modalities and clinical centers. To make this standardized regionalization widely accessible, we developed the open-source software DIVAID, which automatically divides bi-atrial geometries according to the proposed regions, ensuring consistency, reproducibility and operator independence. We evaluated the accuracy of the algorithm by comparing its results to manual expert annotations across 140 geometries from multiple modalities and centers. Veins were automatically clipped correctly in 81% and orifices annotated correctly in 100% of cases. The median (interquartile range; IQR) Dice similarity coefficient (DSC) for left atrial regions was 0.98 (0.96-1.00) for DIVAID-expert and 0.98 (0.94-1.00) for inter-expert comparisons. For right atrial geometries, DSC was higher for DIVAID-expert than for inter-expert comparisons at 0.90 (0.80-0.95) and 0.88 (0.74-0.94), respectively. To assess the accuracy of regional boundaries, we computed the mean average surface distance (MASD) for boundaries derived from automatic or manual annotations. The median (IQR) MASD between DIVAID and experts was 0.17 mm (0.03-0.78) and 1.93 mm (0.65-3.96) in the left and right atrium, respectively. To conclude, DIVAID robustly divides anatomically diverse bi-atrial geometries according to the 15-segment model, while outperforming cardiac experts in both speed and consistency, and demonstrating an accuracy of regional boundaries comparable to the spatial resolution of cardiac imaging modalities. By providing automated, consistent atrial regionalization, DIVAID enables large-scale, standardized regional analyses and data-driven investigation of harmonized, multi-dimensional datasets, which may advance atrial arrhythmia research and personalized treatment strategies.
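
For readers unfamiliar with the Dice similarity coefficient used to compare DIVAID with expert annotations, the following sketch computes it for one region, DSC = 2|A ∩ B| / (|A| + |B|). It assumes each annotation is represented as a boolean mask over the same set of surface elements; that representation, the function name `dice_coefficient`, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details from the abstract.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b) -> float:
    """Dice similarity coefficient between two boolean masks of equal length."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty regions agree perfectly.
        return 1.0
    overlap = np.logical_and(a, b).sum()
    return float(2.0 * overlap / total)
```

A value of 1.0 means the two annotations assign exactly the same elements to the region, and 0.0 means they share none.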

6

Hidden risk in normal myocardial perfusion scans: AI-detected proximal coronary calcium on CT attenuation maps improves prognosis

Zhou, J.; Miller, R. J.; Shanbhag, A.; Killekar, A.; Han, D.; Patel, K. K.; Pieszko, K.; Yi, J.; Urs, M. K.; Ramirez, G.; Lemley, M.; Kavanagh, P. B.; Liang, J. X.; Kamagate, A.; Builoff, V.; Einstein, A. J.; Feher, A.; Miller, E. J.; Sinusas, A. J.; Ruddy, T. D.; Knight, S.; Le, V. T.; Mason, S.; Chareonthaitawee, P.; Wopperer, S.; Alexanderson, E.; Carvajal-Juarez, I.; Rosamond, T. L.; Slipczuk, L.; Travin, M. I.; Packard, R. R.; Acampa, W.; Al-Mallah, M.; deKemp, R. A.; Buechel, R. R.; Berman, D. S.; Dey, D.; Di Carli, M. F.; Slomka, P. J.

2026-04-15 cardiovascular medicine 10.64898/2026.04.14.26350808 medRxiv

Top 0.1%

8.5%

Purpose: Spatial distribution of coronary artery calcium (CAC) may provide additional prognostic value in patients undergoing SPECT and PET myocardial perfusion imaging (MPI). We aimed to automatically identify CAC in proximal segments from attenuation correction CT (CTAC) scans using artificial intelligence (AI) and to evaluate prognostic significance in two large international multicenter registries. Methods: From hybrid MPI/CT imaging (N=43,099) across 15 sites, we included the 4,552 most relevant patients with 1) no prior coronary artery disease; 2) AI-derived mild CAC scores (1-99); and 3) normal perfusion (stress total perfusion deficit <5%). The independent associations between AI-identified proximal CAC and major adverse cardiovascular events (MACE) and all-cause mortality (ACM) were evaluated using multivariable Cox regression, likelihood ratio test (LRT), and continuous net reclassification index (NRI). Results: Among the patients with mild CAC and normal perfusion (mean age 65 ± 12 years, 51% male), 1,730 (38%) had proximal CAC. Over 3.6 (inter-quartile interval 2.1, 5.2) years of follow-up, 599 (13%) and 444 (10%) patients had MACE or ACM, respectively. Proximal CAC was associated with an increased risk of MACE (adjusted hazard ratio [HR] 1.24, 95% CI 1.03-1.48, P=0.02) and ACM (adjusted HR 1.25, 95% CI 1.01-1.53, P=0.04) after adjustment for CAC score and density, clinical risk factors, and perfusion deficit. Proximal CAC improved the risk stratification of MACE (LRT P=0.02; NRI 12%) and ACM (LRT P=0.04; NRI 12%). Conclusion: In patients with mild CAC and normal perfusion, AI detection of proximal CAC identified a higher-risk group for adverse outcomes, highlighting its prognostic utility.

7

A Deep Learning-Based Single-View Echocardiographic Analysis for Prediction of Left Ventricular Outflow Tract Obstruction After Transcatheter Aortic Valve Replacement

Choi, J.-W.; Park, J.; Yoon, Y. E.; Kim, J.; Jeon, J.; Jang, Y.; Lee, S.-A.; Bak, M.; Choi, H.-M.; Hwang, I.-C.; Cho, G.-Y.

2026-03-30 cardiovascular medicine 10.64898/2026.03.27.26349567 medRxiv

Top 0.1%

8.3%

Aims: Dynamic left ventricular outflow tract obstruction (LVOTO) is a hemodynamically significant complication following transcatheter aortic valve replacement (TAVR) that remains difficult to predict with conventional transthoracic echocardiography (TTE). We examined whether a deep learning (DL) model developed for LVOTO detection in hypertrophic cardiomyopathy (HCM) could predict post-TAVR LVOTO from pre-TAVR TTE in patients with severe aortic stenosis (AS). Methods and Results: In this retrospective study of 302 consecutive patients undergoing TAVR for severe AS, a pre-trained DL model was applied to pre-TAVR TTE to generate a patient-level DL index of LVOTO (DLi-LVOTO; range 0-100). Post-TAVR LVOTO was defined as a peak pressure gradient ≥30 mmHg on follow-up TTE. Logistic regression and receiver operating characteristic analyses assessed the association and discriminative performance of DLi-LVOTO. Pre-TAVR LVOTO was present in 32 patients (10.6%) and post-TAVR LVOTO in 35 (11.6%). Follow-up TTE was performed at a median of 47 days (IQR 37-63) after TAVR, with the majority of TTE (216 of 302, 71.5%) performed within 2 months. DLi-LVOTO was significantly higher in patients with LVOTO at both pre- and post-TAVR TTE (all p<0.001). In multivariable analysis, DLi-LVOTO remained independently associated with post-TAVR LVOTO even after adjusting for conventional TTE parameters and pre-TAVR LVOTO (adjusted OR 1.29, 95% CI 1.06-1.56 per 10-score increase, p=0.011), with an AUROC of 0.78 (95% CI 0.72-0.85). Among patients without pre-TAVR LVOTO, DLi-LVOTO retained independent predictive value (adjusted OR 1.56, 95% CI 1.19-2.06, p=0.001; AUROC 0.84, 95% CI 0.77-0.91). Conclusion: A DL model originally trained in HCM patients independently predicts post-TAVR LVOTO from pre-TAVR TTE, including in patients without pre-existing LVOTO, suggesting it captures hemodynamic features beyond conventional echocardiographic assessment.
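
To read an odds ratio reported "per 10-score increase" at other score differences, one can rescale it under the usual logistic-regression assumption that the log-odds are linear in the predictor. The helper below is a worked-arithmetic sketch with a hypothetical name (`scale_odds_ratio`); it rescales the point estimate only and says nothing about how the confidence interval would change or whether linearity holds across the full 0-100 range.

```python
def scale_odds_ratio(or_per_step: float, step: float, delta: float) -> float:
    """Rescale an odds ratio from a `step`-unit increase to a `delta`-unit increase.

    Assumes log-odds are linear in the predictor, so OR(delta) = OR(step) ** (delta / step).
    """
    if or_per_step <= 0 or step == 0:
        raise ValueError("odds ratio must be positive and step non-zero")
    return or_per_step ** (delta / step)
```

For example, under that assumption the reported adjusted OR of 1.29 per 10 points corresponds to roughly 1.29³ ≈ 2.15 for a 30-point difference in DLi-LVOTO.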

8

Heart Failure Prediction & Risk Stratification using Machine Learning

Ali, S.; Leavitt, M. A.; Asghar, W.

2026-04-05 public and global health 10.64898/2026.04.03.26350139 medRxiv

Top 0.1%

8.1%

Heart failure (HF) is one of the most prevalent causes of morbidity, mortality, and healthcare expenditures, with approximately 6.7 million adults in the U.S. suffering from this condition and contributing to hundreds of thousands of deaths annually. Early diagnosis of high-risk individuals has been a challenge, as the HF-specific symptoms are often ignored or misinterpreted as normal aging, stress, or minor illnesses, leading to delayed diagnosis. We trained, tested, and evaluated several models, including logistic regression, SVM, KNN, random forest, XGBoost, MLP, and a custom stacked ensemble using stratified 5-fold CV and 70/30 hold-out splits for HF prediction on routinely available electronic medical record (EMR) data of the All of Us Research Program. This group consisted of 37,070 adults (13,577 HF; 23,493 non-HF). The predictors included readily available demographics, vital signs, conditions, and laboratory results. Preprocessing steps included IQR-winsorization, median imputation, one-hot encoding, and QuantileTransformer. The stacked model obtained ROC-AUC 0.927, PR-AUC 0.895, and accuracy 0.856 in the test set. To support real-world deployment, we calibrated predicted probabilities and adjusted them to a realistic population prevalence, yielding interpretable probability estimates and clear stratification of individuals into clinically actionable risk tiers. SHAP analysis identified the most influential features, namely, atrial fibrillation, age, hypertensive disorder, sodium, and deprivation index, as the top 5 features impacting the model's prediction. A secondary multiclass experiment (No-HF, HF with reduced ejection fraction, and HF with preserved ejection fraction) was performed, achieving lower discrimination results (macro-AUC ~0.87) and a lower per-class precision/recall, presumably due to label noise, class imbalance, and overlapping phenotypes.
We have demonstrated that a carefully calibrated stacked ensemble on the combination of readily available EMR variables can achieve strong discrimination on HF, making it an effective tool for an AI clinical decision support system (AI-CDSS) in population screening and proactive care pathways.
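
One common way to adjust calibrated probabilities to a different population prevalence is a prior-shift correction on the odds scale. The sketch below shows that standard correction; whether the authors used this exact method is not stated in the abstract, so it should be read as an assumption. The study sample's HF prevalence is 13,577 / 37,070 (about 36.6%); the target prevalence is not given in the abstract and must be chosen by the user. The function name `adjust_to_prevalence` is hypothetical.

```python
def adjust_to_prevalence(p: float, train_prev: float, target_prev: float) -> float:
    """Rescale a calibrated probability from the training prevalence to a target prevalence.

    Multiplies the predicted odds by the ratio of target prior odds to training
    prior odds (prior-shift correction), then converts back to a probability.
    """
    for name, value in (("train_prev", train_prev), ("target_prev", target_prev)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    if p <= 0.0 or p >= 1.0:
        # Degenerate predictions carry no odds to rescale; return them unchanged.
        return p
    odds = p / (1.0 - p)
    prior_ratio = (target_prev / (1.0 - target_prev)) / (train_prev / (1.0 - train_prev))
    adjusted_odds = odds * prior_ratio
    return adjusted_odds / (1.0 + adjusted_odds)
```

A useful sanity check is that a prediction equal to the training prevalence maps exactly to the target prevalence, while the ranking of patients (and therefore ROC-AUC) is unchanged because the transformation is monotone.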

9

Prediction of Major Clinical Endpoints in Atrial Fibrillation at Primary Care Level using Longitudinal Learning Stances

Anjos, H.; Lebreiro, A.; Gavina, C.; Henriques, R.; Costa, R. S.

2026-03-27 cardiovascular medicine 10.64898/2026.03.26.26349389 medRxiv

Top 0.1%

7.3%

Atrial fibrillation (AF) is the most prevalent cardiac arrhythmia worldwide and is strongly associated with increased risks of stroke, heart failure, and mortality. Traditional methods to predict AF and prognosticate its associated risks often fail to capture the full complexity of AF patterns, limiting their predictive accuracy. In spite of the improvements achieved by machine learning (ML) techniques, state-of-the-art AF-focused predictors do not generally incorporate longitudinal data, reducing their capacity to model the dynamic and evolving nature of individual behaviors and physiological indicators over time. The absence of a longitudinal perspective restricts understanding of how AF risk develops and changes across prognostic windows. This study addresses these limitations by developing superior ML models tailored to predict adverse events within a longitudinal Portuguese cohort of individuals with AF. The work targets six clinical endpoints: stroke, all-cause death, cardiovascular death, heart failure hospitalizations, inpatient visits, and acute coronary syndrome. The predictors yielded an AUC of 0.65 for 1-year stroke prediction, outperforming CHA2DS2-VASc (0.59). For all-cause mortality prediction, the models achieved an AUC of 0.78 against the 0.72 reference of GARFIELD-AF. In addition to predictive advances, the study identifies determinants of AF-related risks and introduces a prototype decision-support tool for clinical use.

10

Electronic Health Record-Based Estimation of Kansas City Cardiomyopathy Questionnaire Scores in Heart Failure

Kim, Y. W.; Lau, W.; Patel, N.; Kendrick, K.; Wu, A.; Feldman, T.; Ahern, R.; Oka, A.

2026-04-05 health informatics 10.64898/2026.04.03.26350138 medRxiv

Top 0.1%

6.8%

Background: The Kansas City Cardiomyopathy Questionnaire (KCCQ) is a validated patient-reported outcome measure for heart failure. However, its clinical utility is limited by incomplete and inconsistent data collection. We aimed to develop and validate machine learning models to estimate KCCQ overall summary scores from electronic health record (EHR) data. Methods: We assembled a retrospective cohort of 10,889 heart failure patients with recorded KCCQ scores from the Truveta database. Predictor features were derived from structured EHR variables across 13 historical time windows (15-360 days). Multiple regression algorithms were evaluated, followed by SHapley Additive exPlanations (SHAP)-based feature reduction and nested cross-validation for hyperparameter optimization. Model performance was assessed using the coefficient of determination (R²), mean absolute error (MAE), and ordinal discrimination and calibration for categorical severity classification. Results: Histogram-based gradient boosting (HGB) with HGB-SHAP feature selection achieved the strongest performance, reducing feature dimensionality by more than 94% while maintaining estimation accuracy. The 240-day window performed best (R²=0.522, MAE=12.485). For categorical severity classification, the model demonstrated strong ordinal discrimination (mean ordinal AUROC=0.850). Quantile-based calibration improved classification balance, increasing the F1-score for the most severe category (KCCQ<25) from 0.180 to 0.428 and the quadratic weighted kappa from 0.601 to 0.640. Longer EHR observation windows were associated with improved prediction performance. Conclusion: Machine learning models can estimate KCCQ scores from routine EHR data with clinically meaningful accuracy and strong discriminatory performance.
This approach may help extend assessment of patient-reported health status to populations in which survey-based data are incompletely captured, supporting population-level cardiovascular outcomes assessment and risk stratification in heart failure care.

11

VIsual STAndardized Quantification of LGE (VISTAQ), a contour-less method for late gadolinium enhancement quantification

Aquaro, G. D.; Licordari, R.; De Gori, C.; Todiere, G.; Ianni, U.; Barison, A.; De Luca, A.; Folgheraiter, A.; Grigoratos, C.; Alberti, M.; Lombardo, M.; De Caterina, R.; Sinagra, G.; Emdin, M.; Di Bella, G.; Fulceri, L.

2026-04-15 cardiovascular medicine 10.64898/2026.04.09.26350552 medRxiv

Top 0.1%

6.7%

Background: Late gadolinium enhancement (LGE) quantification by cardiovascular magnetic resonance is central to risk stratification in hypertrophic cardiomyopathy (HCM), yet conventional techniques require contour tracing and region-of-interest (ROI) placement, which may reduce reproducibility and increase analysis time. We developed a novel visual standardized approach, the Visual Standardized Quantification of LGE (VISTAQ), that does not require myocardial contouring, arbitrary ROI positioning, or dedicated post-processing software. Methods: In this multicenter, multivendor retrospective study, LGE images from 400 patients (100 prior myocardial infarction, 250 HCM, 50 other non-ischemic heart diseases) were analyzed. VISTAQ subdivides each myocardial segment into transmural mini-segments and classifies LGE visually using predefined criteria, expressing global LGE burden as the percentage of positive mini-segments. Reproducibility was assessed in 250 patients across different observer expertise levels using intraclass correlation coefficients (ICC) and Bland-Altman analysis. In 100 HCM patients, VISTAQ was compared with conventional methods (mean+2SD, +5SD, +6SD, FWHM, visual thresholding). Prognostic performance was evaluated in 250 HCM patients over a median 5-year follow-up. Results: VISTAQ demonstrated excellent intra- and inter-observer reproducibility (ICC up to 0.98 and 0.97, respectively), consistent across disease subtypes. Compared with conventional techniques, VISTAQ showed similar ICC to FWHM but significantly lower net and absolute inter-observer differences (median absolute difference 1.3%). Mean+2SD markedly overestimated LGE, whereas mean+6SD slightly underestimated LGE compared with VISTAQ, mean+5SD, FWHM, and visual thresholding. Analysis time was substantially shorter with VISTAQ (median 105 vs. 375 seconds, p<0.0001). During follow-up, 21 hard cardiac events occurred in the HCM population.
An LGE threshold >10% predicted events with higher accuracy using VISTAQ (AUC 0.90; sensitivity 85%; specificity 94%) compared with mean+6SD (AUC 0.75; sensitivity 57%; specificity 93%). Conclusions: VISTAQ provides highly reproducible, time-efficient LGE quantification without dedicated software and demonstrates non-inferior prognostic discrimination in HCM compared with conventional threshold-based techniques.
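
The global LGE burden described in the abstract, the percentage of mini-segments visually classified as positive, reduces to a simple proportion. The sketch below assumes the reader's visual classifications are already available as a flat list of booleans; the number of mini-segments per myocardial segment and the visual criteria themselves are not given in the abstract, and the function names are hypothetical.

```python
def lge_burden_percent(mini_segment_positive) -> float:
    """Global LGE burden: percentage of mini-segments classified as LGE-positive."""
    flags = [bool(x) for x in mini_segment_positive]
    if not flags:
        raise ValueError("at least one mini-segment is required")
    return 100.0 * sum(flags) / len(flags)

def exceeds_threshold(burden_percent: float, threshold: float = 10.0) -> bool:
    """Apply the >10% LGE threshold reported in the abstract (strict inequality)."""
    return burden_percent > threshold
```

For instance, 3 positive mini-segments out of 20 give a burden of 15%, which lies above the 10% threshold evaluated in the study.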

12

Multimodal Integration of Ambulatory ECG and Clinical Features for Sudden Cardiac Death and Pump Failure Death Prediction

Swee, S.; Adam, I.; Zheng, E. Y.; Ji, E.; Wang, D.; Speier, W.; Hsu, J.; Chang, K.-W.; Shivkumar, K.; Ping, P.

2026-04-22 cardiovascular medicine 10.64898/2026.04.21.26351421 medRxiv

Top 0.1%

6.4%

Ambulatory electrocardiography (ECG) provides continuous monitoring of the heart's electrical activity. However, many existing machine learning and artificial intelligence models for analyzing ambulatory ECG traces are unimodal and do not incorporate patient clinical context. In this study, we propose a multimodal framework integrating ambulatory ECG-derived representations with clinical text embeddings to predict two cardiac outcomes: sudden cardiac death and pump failure death. Ambulatory ECG traces are preprocessed, segmented, and encoded via a multiple instance learning and temporal convolutional neural network framework. In parallel, patient clinical features are parsed into structured prompts, which are passed through a large language model to generate clinical reasoning; this reasoning passes through a biomedical language encoder to generate a text embedding. With the ECG and text embeddings, we systematically evaluate multiple fusion strategies, including concatenation- and gating-based approaches, to integrate these two data modalities. Our results demonstrate that multimodal models consistently outperform unimodal baselines, with adaptive fusion mechanisms providing the greatest improvements in predictive performance. Decision curve analysis highlights the potential clinical utility of the proposed framework for risk stratification. Finally, we visualize model attention across modalities, including ECG attention patterns, segment-level saliency, heart rate variability features, and clinical reasoning, to contextualize patient-specific predictions.
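
The difference between concatenation-based and gating-based fusion of two embeddings can be sketched in a few lines of NumPy. This is a generic illustration, not the authors' architecture: it assumes both embeddings have already been projected to the same dimension `d`, and the gate parameters `W` (shape `(d, 2d)`) and `b` (shape `(d,)`) are hypothetical stand-ins for weights that would be learned during training.

```python
import numpy as np

def concat_fusion(ecg_emb, text_emb):
    """Concatenation fusion: stack both embeddings into one longer vector."""
    return np.concatenate([ecg_emb, text_emb])

def gated_fusion(ecg_emb, text_emb, W, b):
    """Gating fusion: a per-dimension gate decides how much each modality contributes.

    gate  = sigmoid(W @ [ecg; text] + b), values in (0, 1)
    fused = gate * ecg + (1 - gate) * text
    """
    joint = np.concatenate([ecg_emb, text_emb])
    gate = 1.0 / (1.0 + np.exp(-(W @ joint + b)))
    return gate * ecg_emb + (1.0 - gate) * text_emb
```

Because the gate depends on the inputs, the weighting between ECG and clinical text can differ from patient to patient, which is one sense in which such a fusion mechanism is adaptive; with all-zero gate parameters it reduces to a plain average of the two embeddings.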

13

Vision Language Model for Coronary Angiogram Analysis and Report Generation: Development and Evaluation Study

Jiang, Q.; Ke, Y.; Sinisterra, L. G.; Elangovan, K.; Li, Z.; Yeo, K. K.; Jonathan, Y.; Ting, D. S. W.

2026-04-21 cardiovascular medicine 10.64898/2026.04.19.26351241 medRxiv

Top 0.1%

6.4%

Coronary artery disease is a leading cause of morbidity and mortality. Invasive coronary angiography is currently the gold standard in disease diagnosis. Several studies have attempted to use artificial intelligence (AI) to automate its interpretation with varying levels of success. However, most existing studies cannot generate detailed angiographic reports beyond simple classification or segmentation. This study aims to fine-tune and evaluate the performance of a Vision-Language Model (VLM) in coronary angiogram interpretation and report generation. Using twenty thousand angiogram keyframes from 1987 patients collated across four unique datasets, we fine-tuned the InternVL2-4B model with Low-Rank Adaptor weights to perform stenosis detection, anatomy labelling, and report generation. The fine-tuned VLM achieved a precision of 0.56, recall of 0.64, and F1-score of 0.60 for stenosis detection. In anatomy segmentation, it attained a weighted precision of 0.50, recall of 0.43, and F1-score of 0.46, with higher scores in major vessel segments. Report generation integrating multiple angiographic projection views yielded an accuracy of 0.42, negative predictive value of 0.58 and specificity of 0.52. This study demonstrates the potential of using VLM to streamline angiogram interpretation to rapidly provide actionable information to guide management, support care in resource-limited settings, and audit the appropriateness of coronary interventions. AUTHOR SUMMARY: Coronary artery disease carries a heavy disease burden worldwide, and coronary angiography is the gold standard imaging for its diagnosis. Interpreting these complex images and producing clinical reports requires significant expertise and time. In this study, we fine-tuned and investigated an open-source VLM, InternVL2-4B, to interpret and report coronary angiogram images in key tasks including stenosis detection, anatomy identification, as well as full report generation.
We also compared the fine-tuned InternVL2-4B against a state-of-the-art segmentation model, YOLOv8x, which was evaluated on the same test sets. We examined how machine learning metrics like the intersection over union score may not fully capture the clinical accuracy of model predictions and discussed the limitations of relying solely on these metrics for evaluating clinical AI systems. Although the model has not yet achieved expert-level interpretation, our results demonstrate the potential and feasibility of automating the reporting of coronary angiograms. Such systems could potentially assist cardiologists by improving reporting efficiency, highlighting lesions that may require review, and enabling automated calculations of clinical scores such as the SYNTAX score.

14

Causal Machine Learning for Comparative Effectiveness of GLP-1 RA versus SGLT2i in Heart Failure Using Real-World EHR Data

Han, G. Y.; Kalogeropoulos, A. P.; Butzin-Dozier, Z.; Wong, R.; Wang, F.

2026-04-07 cardiovascular medicine 10.64898/2026.04.06.26350259 medRxiv

Top 0.1%

6.2%

Clinicians lack precision medicine tools to estimate individualized treatment effects for patients with heart failure (HF). Causal machine learning leveraging electronic health records can estimate both average and individualized treatment effects, enabling estimation of treatment heterogeneity. Using Stony Brook University Hospital data, we compared the effectiveness of glucagon-like peptide-1 receptor agonists (GLP-1 RA) versus sodium-glucose cotransporter 2 inhibitors (SGLT2i) in patients with HF. Under a doubly robust framework, we found a stable population-average effect: GLP-1 RA was associated with a lower risk than SGLT2i for a 1-year composite outcome of all-cause mortality or HF-related hospitalization. Heterogeneity analyses provided limited evidence for individualized treatment selection, although subgroup tests identified loop diuretic use, body mass index, and estimated glomerular filtration rate as potential effect modifiers. While these models hold promise for translating observational data into actionable precision care, careful assessment of causal assumptions and rigorous validation are essential before clinical implementation.

15

Papillary muscles, ventricular loading, and atrial remodelling as beat-to-beat determinants of functional mitral regurgitation: an exploratory Granger causality study

Eotvos, C. A.; Avram, T.; Blendea, E. D.; Munteanu, M. I.; Bubuianu, A. F.; Moldovan, M. P.; Hedesiu, P.; Lazar, R. D.; Zehan, I. G.; Sarb, A. D.; Coseriu, G.; Schiop-Tentea, P.; Mocan-Hognogi, D. L.; Chiorescu, R.; Pop, S.; Diosan, L.; Heist, E. K.; Blendea, D.

2026-04-05 cardiovascular medicine 10.64898/2026.04.03.26350122 medRxiv

Top 0.1%

4.3%

Background: Functional mitral regurgitation results from interacting mechanisms whose relative contributions vary between atrial and ventricular subtypes and shift dynamically within each heartbeat, producing temporal patterns that static analyses cannot capture. Objectives: To identify which structural determinants predict mitral regurgitation variability beat to beat using Granger causality within vector autoregression, focusing on papillary muscle dynamics across subtypes. Methods: Frame-level echocardiographic time series from 41 patients (21 atrial, 20 ventricular; 1,959 frames) were z-score standardised within patient. Individual (lag 3) and pooled (lag 2) vector autoregression models tested whether left ventricular volume, left atrial volume, papillary muscle length, and annulus diameter Granger-predict mitral regurgitation area. Results: Individual models revealed marked heterogeneity. In pooled analysis, left ventricular volume was the strongest Granger predictor at short lags (atrial p=0.011; ventricular p=0.006), while left atrial volume emerged at longer lags (lag 7: atrial p=0.043; ventricular p=0.011). Systolic papillary muscle length was not predictive. Full-cycle analysis revealed a subtype-specific dissociation: papillary muscle length Granger-predicted regurgitation only in the ventricular subtype (p=0.001), while regurgitation predicted papillary muscle displacement only in the atrial subtype (p<0.001). Left ventricular volume dominated within-beat prediction but lost cross-beat relevance in the ventricular subtype, while left atrial volume gained cross-beat predictive relevance in the atrial subtype. No structural determinant correlated with severity cross-sectionally. Conclusions: Beat-to-beat vector autoregression and Granger modelling reveal heterogeneous, subtype-specific temporal patterns with distinct temporal windows of predictability for ventricular loading and papillary geometry. This framework may support patient-specific temporal phenotyping of functional mitral regurgitation.

16

Diastolic Age: A Cardiac Biological Clock Derived from Echocardiography and the PREVENT Heart Failure Risk Score

Fahed, G.; Cauwenberghs, N.; Santana, E. J.; Chen, R.; Celestin, B. E.; Gomes Botelho Quintas, B. F.; Short, S.; Carroll, M.; Miyoshi, T.; Alexander, K. M.; Shah, S. H.; Orr, S. S.; Kovacs, A.; Daubert, M. A.; Kuznetsova, T.; Addetia, K.; Asch, F. M.; Mahaffey, K. W.; Douglas, P. S.; Haddad, F.

2026-04-17 cardiovascular medicine 10.64898/2026.04.15.26350995 medRxiv

Top 0.1%

4.1%

Background: Among cardiac measures, diastolic parameters demonstrate the earliest and most consistent age-related changes. This can be leveraged to develop a continuous left ventricular (LV) Diastolic Age from routine echocardiographic parameters. Analogous to how epigenetic clocks weight molecular markers against mortality risk, we calibrated Diastolic Age by weighting echocardiographic features against the validated PREVENT-Heart Failure (HF) risk score. Methods: We analyzed 1,952 participants from the Project Baseline Health Study (median age 50 [36-64] years, 54% female). The measure was derived using partial least-squares regression anchored on PREVENT-HF and calibrated within a healthy reference subgroup. External validation was performed in the WASE (n=1,708) and Stanford Cardiovascular Aging (n=313) cohorts. Associations with ASE-defined LV diastolic dysfunction (LVDD), epigenetic clocks, and major adverse cardiovascular events (MACE) were examined. Results: Diastolic Age correlated strongly with chronological age (r=0.78) with robust external validation (WASE r=0.76; Stanford r=0.82; calibration slopes ≈1.0). It increased progressively across grades of diastolic dysfunction and discriminated LVDD with an AUC of 0.89 (95% CI 0.87-0.92), and was independently associated with hypertension, diabetes, and elevated C-reactive protein. While correlated with the Levine (r=0.76) and Horvath (r=0.41) epigenetic clocks, residual analyses indicated that Diastolic Age captures a distinct cardiac-specific dimension of biological aging. Over median follow-up of 4.2 years, it independently predicted MACE (HR 2.30, 95% CI 1.70-3.18), with accelerated diastolic aging across all age groups among those with events. Discrimination was comparable to ASE-defined LVDD (C-index 0.83 vs. 0.82). Conclusion: Diastolic Age provides a continuous, echocardiography-derived measure of cardiac biological aging that complements categorical diastolic grading and epigenetic aging clocks, and independently predicts cardiovascular outcomes.

17

SMART-HF: Structured Management Approach to Remote Treatment of Heart Failure Associated With Predictable Hemodynamic Improvements In A Community Remote Pulmonary Artery Pressure Monitoring Program

Atzenhoefer, M.; Nelson, B.; Atzenhoefer, T. E.; Staudacher, M.; Boxwala, H.; Iqbal, F. M.

2026-04-16 cardiovascular medicine 10.64898/2026.04.12.26350637 medRxiv

Top 0.1%

4.1%

Aims: Responses to remote pulmonary artery pressure data vary across programs. We evaluated SMART-HF, a structured pulmonary artery diastolic pressure (PAD)-guided workflow, in a community heart failure cohort. Methods: We retrospectively analysed adults with heart failure and an implanted pulmonary artery pressure sensor managed with SMART-HF. Pulmonary artery diastolic pressure (PAD) was calculated from prespecified 14-day windows at baseline, 90 days, and 6 months. Two hemodynamic management performance indices (HMPI) were prespecified: the 6-Month Delta HMPI (PAD reduction >2 mmHg from baseline) and the 90-Day Target HMPI (PAD ≤20 mmHg at 90 days). Exploratory analyses evaluated patients with baseline PAD >20 mmHg. Results: Of 37 patients, 36 had paired 90-day and 29 had paired 6-month windows. Mean PAD decreased from 18.3 ± 7.0 to 16.1 ± 6.3 mmHg at 90 days and from 18.8 ± 6.8 to 15.5 ± 5.8 mmHg at 6 months (both P < 0.001). The 90-Day Target HMPI was achieved in 26/36 (72.2%) and the 6-Month Delta HMPI in 19/29 (65.5%; 95% CI 45.7-82.1). In the exploratory subgroup (baseline PAD >20 mmHg), mean PAD changes were -2.9 ± 3.6 mmHg at 90 days (n = 19; P = 0.002) and -4.9 ± 4.9 mmHg at 6 months (n = 15; P = 0.002). Conclusions: SMART-HF was associated with improved ambulatory pulmonary artery diastolic pressure control at 90 days and 6 months. Exploratory subgroup findings support further evaluation in patients with elevated baseline pulmonary artery diastolic pressure.
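The two performance indices defined in this abstract reduce to simple threshold checks on per-patient PAD values. The sketch below encodes the stated definitions; the function names are hypothetical, and it assumes each window has already been summarised as a single PAD value in mmHg (the abstract does not state how readings within a 14-day window were aggregated).

```python
def delta_hmpi_met(baseline_pad, six_month_pad):
    """6-Month Delta HMPI: PAD reduction of more than 2 mmHg from baseline."""
    return (baseline_pad - six_month_pad) > 2.0

def target_hmpi_met(pad_90_day):
    """90-Day Target HMPI: PAD at or below 20 mmHg at 90 days."""
    return pad_90_day <= 20.0

def attainment_rate(flags):
    """Percentage of patients meeting an index, given one boolean per patient."""
    return 100.0 * sum(flags) / len(flags)
```

For example, 26 patients meeting the 90-day target out of 36 with paired windows gives `attainment_rate([True] * 26 + [False] * 10)`, about 72.2%, matching the proportion reported in the abstract.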

18

CorSeg-CineSAX: An Open-Source Deep Learning Framework for Fully Automatic Segmentation of Short-Axis Cine Cardiac MRI Across Multiple Cardiac Diseases

Xu, R.; Jiang, S.; Zhai, Y.; Chen, Y.

2026-04-03 cardiovascular medicine 10.64898/2026.04.01.26349955 medRxiv

Top 0.1%

4.0%

Background: Segmentation of the left ventricular myocardium, left ventricular cavity, and right ventricular cavity on short-axis cine cardiac magnetic resonance (CMR) images is essential for quantifying cardiac structure and function. However, existing automated segmentation tools are limited by small training datasets, narrow disease coverage, restrictive input format requirements, and the absence of anatomical plausibility constraints, hindering their clinical adoption. Methods: We constructed the largest annotated CMR short-axis segmentation dataset to date, comprising 1,555 subjects from 12 centers with five cardiac disease types and full cardiac cycle annotations totaling 319,175 labeled images. A MedNeXt-L model was trained using a 2D slice-by-slice strategy with full field-of-view input, eliminating dependencies on 3D volumes, temporal sequences, or region-of-interest (ROI) localization. A deterministic three-step post-processing pipeline was designed to enforce anatomical priors: connected component constraint, containment relationship constraint, and gap-filling constraint. The model was validated on an internal test set (310 subjects) and three independent public external datasets (ACDC, M&Ms1, M&Ms2; 855 subjects from 6 additional centers across 3 countries), spanning 15 cardiac disease categories, 10 of which were never encountered during training. Results: The model achieved mean Dice similarity coefficients (DSC) of 0.913 ± 0.037 and 0.911 ± 0.040 on internal and external test sets, respectively, with a cross-domain performance gap of only 0.002. Post-processing eliminated all containment violations (7.5% → 0%) and gap errors (1.8% → 0%) while reducing fragment rates by 85.5% (9.0% → 1.3%). Zero-shot generalization to 10 unseen disease categories yielded DSC values ranging from 0.899 to 0.921. Automated clinical functional parameters demonstrated excellent agreement with manual measurements for left ventricular indices and right ventricular volumes (intraclass correlation coefficients ≥ 0.977). Conclusions: CorSeg-CineSAX provides a robust, open-source framework for fully automatic CMR short-axis segmentation across diverse clinical scenarios. All source code and pre-trained weights are publicly available at https://github.com/RunhaoXu2003/CorSeg.
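To make the Dice similarity coefficient and the connected component constraint concrete, here is a minimal sketch on masks represented as sets of (row, col) pixel coordinates. It is not the CorSeg-CineSAX implementation: the abstract does not describe how the constraint is applied, so keeping only the largest 4-connected component is an assumption, and both function names are hypothetical.

```python
from collections import deque

def dice(pred, truth):
    """Dice similarity coefficient between two sets of pixel coordinates."""
    if not pred and not truth:
        return 1.0  # Two empty masks agree perfectly by convention.
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))

def largest_component(pixels):
    """Keep only the largest 4-connected component of a pixel set."""
    remaining, best = set(pixels), set()
    while remaining:
        seed = remaining.pop()
        comp, queue = {seed}, deque([seed])
        # Breadth-first flood fill from the seed pixel.
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    comp.add(nb)
                    queue.append(nb)
        if len(comp) > len(best):
            best = comp
    return best
```

With a 2x2 ground-truth region and a prediction that adds one stray pixel elsewhere, the Dice score is 8/9 before filtering and 1.0 after the stray fragment is removed, which illustrates how a fragment-removal step can raise overlap scores.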

19

Identifying and replicating plasma proteins associated with hypertrophic cardiomyopathy severity in carriers of pathogenic MYBPC3 variants

Hassanzada, F.; van Vugt, M.; Jansen, M.; Baas, A.; te Riele, A. S.; Dooijes, D.; van der Crabben, S. N.; Jongbloed, J. D.; Cox, M. G.; Amin, A. S.; Lekanne Deprez, R. H.; Ruijsink, B.; Kuster, D. W.; van der Velden, J.; Bezzina, C. R.; Asselbergs, F. W.; van Tintelen, J. P.; van Spaendonck-Zwarts, K. Y.; Schmidt, A. F.

2026-03-30 cardiovascular medicine 10.64898/2026.03.28.26349616 medRxiv

Top 0.1%

4.0%

Background. Hypertrophic cardiomyopathy (HCM) is a clinically variable disease in terms of onset and progression. Pathogenic MYBPC3 variants account for a substantial proportion of HCM diagnoses. This study sought to identify protein biomarkers associated with HCM severity. Methods. Olink-assayed plasma proteins of 144 MYBPC3 pathogenic variant carriers were tested for associations with HCM severity based on HCM diagnostic criteria (unaffected, mildly, or severely affected). The UK Biobank was used to replicate the identified proteins by considering time to onset of HCM (67 cases) and cardiomyopathy (156 cases), and associations with cardiac MRI-derived left ventricular maximum wall thickness (6,492 participants). Replicated proteins were further prioritised based on cardiac tissue expression and druggability, and annotated using pathway enrichment and association with onset of heart failure (HF), dilated cardiomyopathy (DCM), sudden cardiac arrest (SCA), and ventricular arrhythmias (VA). Results. Among pathogenic MYBPC3 variant carriers, we identified 27 proteins associated with HCM severity. We independently replicated 21 proteins in the UK Biobank. Of the five prioritised proteins (NT-proBNP, GDF-15, FGF-23, ADM, and NCAM1), all but NT-proBNP were targeted by drugs with repurposing potential. The replicated proteins were additionally associated with the incidence of HF (n=5), DCM (n=4), SCA (n=4), and VA (n=4). Conclusion. This study replicated 21 and prioritised five proteins associated with HCM severity in pathogenic MYBPC3 variant carriers. Replication in unselected HCM suggests the prioritised proteins are associated with HCM independent of genotype, providing important leads for plasma-based markers for diagnoses, disease monitoring, and drug targets.

20

Machine learning-based advanced coronary artery disease pretest probability model: Comparison with conventional pretest probability models

Hong, Y.; Lee, J.; Park, H.-B.; Kim, W.; Yoon, Y. E.; Jeong, H.; Kim, G.; So, B.; Lee, J.; Dalakoti, M.; Sung, J. M.; Kook, W.; Chang, H.-J.

2026-03-27 cardiovascular medicine 10.64898/2026.03.25.26348861 medRxiv

Top 0.2%

3.9%

Background: Pretest probability (PTP) models using clinical risk factors guide decision-making for coronary artery disease (CAD). Existing models (Updated Diamond-Forrester [UDF] and CAD Consortium [CAD2]) exhibit suboptimal predictive efficacy in Asian populations due to ethnic differences in atherosclerosis and risk profiles. We developed an advanced CAD-specific PTP model using ridge-penalized logistic regression and validated its reliability. Methods: Utilizing data from 4,696 Korean patients (3 trials and 2 cohorts), we employed ridge regression to develop an advanced PTP model (K-CAD) for identifying patients with CAD with ≥50% diameter stenosis, determined using coronary computed tomography or invasive coronary angiography. External validation used datasets from another tertiary center (External Validation Cohort 1, n=428) and a nationwide health checkup cohort (External Validation Cohort 2, n=117,294). We compared K-CAD with existing models using continuous receiver operating characteristic (ROC) and ternary net reclassification improvement (NRI) analyses. Findings: Continuous ROC analysis in External Validation Cohort 1 revealed areas under the curves (AUCs) for UDF, 0.68 (95% confidence interval [CI] 0.63-0.73); CAD2, 0.71 (95% CI 0.67-0.76); and K-CAD, 0.76 (95% CI 0.71-0.80). K-CAD significantly outperformed UDF (p<0.001) and CAD2 (p<0.05). NRI analysis demonstrated that K-CAD improved reclassification of non-obstructive patients into low-risk categories. External validation using the nationwide dataset (surrogate endpoint: ICD-10 I20) yielded AUCs for UDF, 0.61 (95% CI 0.58-0.64); CAD2, 0.66 (95% CI 0.63-0.69); and K-CAD, 0.67 (95% CI 0.64-0.70). Interpretation: The study demonstrated K-CAD's utility employing extensive high-quality datasets, highlighting its potential for predicting CAD risk in the Korean population.
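The AUC values compared in this abstract can be read as the probability that a randomly chosen patient with obstructive CAD receives a higher predicted probability than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the function name is hypothetical, and the quadratic loop is chosen for clarity rather than for use on a cohort of 117,294.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    scores: predicted probabilities; labels: 1 for disease, 0 for no disease.
    Ties between a case and a non-case count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported differences between models (for example 0.68 versus 0.76) reflect how often each model ranks a case above a non-case.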